A Token-based Code Clone Detection Technique and Its Evaluation
Abstract
A code clone is a code portion in source files that is identical or similar to another. Since code clones generally reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of a transformation of the input source text followed by a token-by-token comparison. Based on the proposed technique, we developed a tool named CCFinder, which extracts code clones from C/C++ or Java source files. Metrics for code clones were also developed. To evaluate the usefulness of the tool and the metrics, we conducted several experiments. As a result, the tool found several subsystems in two operating systems, namely FreeBSD and Linux, that could be traced to the same original. In addition, the proposed metrics found interesting clones in a Java library, JDK.
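The two steps named in the abstract, transformation of the source text and token-by-token comparison, can be illustrated with a minimal Python sketch. This is not CCFinder's implementation. The tiny keyword set, the `$id`/`$num` placeholders, and the function names `normalize` and `find_clone_pairs` are assumptions made for illustration, and a naive fixed-length window index stands in for whatever matching algorithm the tool actually uses.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny keyword set; a real tool would use a full lexer.
KEYWORDS = {"int", "return", "if", "else", "for", "while"}
# An identifier/keyword, a run of digits, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Transformation step: map identifiers to `$id` and numbers to `$num`."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)          # keywords are kept as-is
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("$id")        # any other name becomes a placeholder
        elif tok.isdigit():
            tokens.append("$num")       # numeric literals become a placeholder
        else:
            tokens.append(tok)          # punctuation and operators are kept
    return tokens


def find_clone_pairs(tokens_a, tokens_b, min_len):
    """Comparison step: list (i, j) such that the min_len tokens starting at
    tokens_a[i] equal the min_len tokens starting at tokens_b[j]."""
    index = defaultdict(list)
    for i in range(len(tokens_a) - min_len + 1):
        index[tuple(tokens_a[i:i + min_len])].append(i)
    pairs = []
    for j in range(len(tokens_b) - min_len + 1):
        for i in index.get(tuple(tokens_b[j:j + min_len]), []):
            pairs.append((i, j))
    return pairs
```

For example, `int sum(int a, int b) { return a + b; }` and `int add(int x, int y) { return x + y; }` normalize to the same token sequence, so the comparison step reports them as a clone pair even though every identifier differs.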
Similar resources
CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code
A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison. For its implementati...
Robust Parsing of Cloned Token Sequences
Token-based clone detection techniques are known for their scalability, high recall, and robustness against syntax errors and incomplete code. They, however, may yield clones that are syntactically incomplete and they know very little about the syntactic structure of their reported clones. Hence, their results cannot immediately be used for automated refactorings or syntactic filters for releva...
Cross Language Higher Level Clone Detection- Between Two Different Object Oriented Programming Language Source Codes
Similar or repeated source code in software is known as code clones. Clone detection techniques can identify similar source code present in software applications. These code clones increase fault and maintenance costs. New source code derived from other source code without proper changes leads to errors. Detection of code clones hel...
Analyzing the Robustness of Clone Detection Tools Regarding Code Obfuscation
Research has shown that 7% to 23% of a typical source code system consists of cloned code. Some clones are introduced intentionally, but the majority are created unintentionally. To find these clones, several code clone detection tools have been developed. They are used in several fields, such as software plagiarism detection, malware detection, and code quality improvement. However, this process is ...
CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code
A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison. For its implementatio...
Publication date: 2001